Parallel solver for shifted systems in a hybrid CPU-GPU framework
Authors
Abstract
This paper proposes a combination of a hybrid CPU–GPU and a pure GPU software implementation of a direct algorithm for solving shifted linear systems (A − σI)X = B with a large number of complex shifts σ and multiple right-hand sides. Such problems often arise, for example, in control theory when evaluating the transfer function, as part of an algorithm for interpolatory model reduction, when computing pseudospectra and structured pseudospectra, or when solving large linear systems of ordinary differential equations. The proposed algorithm first jointly reduces the general full n × n matrix A and the full n × m right-hand side matrix B to the controller Hessenberg canonical form, which facilitates efficient solution: A is transformed to a so-called m-Hessenberg form and B is made upper triangular. This reduction is implemented as a blocked, highly parallel hybrid CPU–GPU algorithm; individual blocks are reduced by the CPU, and the necessary updates of the rest of the matrix are split among the cores of the CPU and the GPU. To enhance parallelization, the reduction and the updates are overlapped. In the next phase, the reduced m-Hessenberg–triangular systems are solved entirely on the GPU, with the shifts divided into batches. The benefits of such load distribution are demonstrated by numerical experiments. In particular, we show that the proposed implementation provides an excellent basis for efficient implementations of computational methods in systems and control theory, from evaluation of the transfer function to interpolatory model reduction.
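The two-phase idea in the abstract (reduce once, then solve cheaply for every shift) can be illustrated with a minimal sketch. This is not the paper's implementation: it runs on the CPU only, in pure Python, and uses an ordinary Hessenberg reduction of a real matrix A instead of the joint m-Hessenberg–triangular reduction of A and B. It also skips blocking, batching of shifts, and any GPU work. All function names (`hessenberg`, `solve_shifted_hessenberg`, `solve_shifted`, `residual`) are hypothetical. The sketch relies on the identity A − σI = Q(H − σI)Qᵀ, so the O(n³) reduction is paid once and each shift costs only an O(n²) Hessenberg solve.

```python
import math

def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def hessenberg(A):
    """Householder reduction of a real square A: returns H, Q with A = Q H Q^T."""
    n = len(A)
    H = [[float(x) for x in row] for row in A]
    Q = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(n - 2):
        v = [H[i][k] for i in range(k + 1, n)]
        norm = math.sqrt(sum(t * t for t in v))
        if norm == 0.0:
            continue  # column is already reduced
        v[0] += math.copysign(norm, v[0])
        beta = 2.0 / sum(t * t for t in v)
        rows = range(k + 1, n)
        # Left update H <- P H, with P = I - beta v v^T acting on rows k+1..n-1.
        for j in range(n):
            s = beta * sum(v[i - k - 1] * H[i][j] for i in rows)
            for i in rows:
                H[i][j] -= s * v[i - k - 1]
        # Right updates H <- H P and Q <- Q P, acting on columns k+1..n-1.
        for M in (H, Q):
            for r in range(n):
                s = beta * sum(M[r][j] * v[j - k - 1] for j in rows)
                for j in rows:
                    M[r][j] -= s * v[j - k - 1]
        for i in range(k + 2, n):
            H[i][k] = 0.0  # clear rounding noise below the subdiagonal
    return H, Q

def solve_shifted_hessenberg(H, sigma, C):
    """Solve (H - sigma I) Y = C in O(n^2) per right-hand side.

    Uses elimination with pivoting between neighbouring rows; sigma is
    assumed not to be an eigenvalue of H (otherwise a pivot is zero).
    """
    n, m = len(H), len(C[0])
    M = [[complex(H[i][j]) - (sigma if i == j else 0) for j in range(n)]
         for i in range(n)]
    Y = [[complex(t) for t in row] for row in C]
    for k in range(n - 1):
        if abs(M[k + 1][k]) > abs(M[k][k]):
            M[k], M[k + 1] = M[k + 1], M[k]
            Y[k], Y[k + 1] = Y[k + 1], Y[k]
        f = M[k + 1][k] / M[k][k]
        for j in range(k, n):
            M[k + 1][j] -= f * M[k][j]
        for j in range(m):
            Y[k + 1][j] -= f * Y[k][j]
    for i in range(n - 1, -1, -1):  # back substitution
        for j in range(m):
            s = sum(M[i][p] * Y[p][j] for p in range(i + 1, n))
            Y[i][j] = (Y[i][j] - s) / M[i][i]
    return Y

def solve_shifted(A, B, shifts):
    """Return one solution X of (A - sigma I) X = B per shift, reducing A once."""
    H, Q = hessenberg(A)
    C = matmul([list(col) for col in zip(*Q)], B)  # C = Q^T B
    return [matmul(Q, solve_shifted_hessenberg(H, s, C)) for s in shifts]

def residual(A, sigma, X, B):
    """Largest entry of |A X - sigma X - B|."""
    R = matmul(A, X)
    return max(abs(R[i][j] - sigma * X[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(B[0])))

if __name__ == "__main__":
    A = [[4.0, 1.0, 2.0, 0.5], [1.0, 3.0, 0.0, 1.0],
         [2.0, 0.0, 5.0, 1.0], [0.5, 1.0, 1.0, 2.0]]
    B = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [1.0, 1.0]]
    shifts = [1j, 2 + 0.5j, -3.0]
    for s, X in zip(shifts, solve_shifted(A, B, shifts)):
        print(s, residual(A, s, X, B))
```

In this sketch the per-shift solves are independent of one another, which is the property that makes it natural to divide the shifts into batches and process them in parallel, as the abstract describes for the GPU phase.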
Similar resources
Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal
Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks, because noise arises during the acquisition or transmission process. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents a Cellular Automata (CA) framework for noise removal of distorted image by the salt an...
Solvers on advanced parallel architectures with application to partial differential equations and discrete optimisation
This thesis investigates techniques for the solution of partial differential equations (PDE) on advanced parallel architectures comprising central processing units (CPU) and graphics processing units (GPU). Many physical phenomena studied by scientists and engineers are modelled with PDEs, and these are often computationally expensive to solve. This is one of the main drivers of large-scale comp...
A CPU-GPU hybrid approach for the unsymmetric multifrontal method
Multifrontal is an efficient direct method for solving large-scale sparse and unsymmetric linear systems. The method transforms a large sparse matrix factorization process into a sequence of factorizations involving smaller dense frontal matrices. Some of these dense operations can be accelerated by using a graphics processing unit (GPU). We analyze the unsymmetric multifrontal method from both an ...
A Hybrid Parallel Spatial Interpolation Algorithm for Massive LiDAR Point Clouds on Heterogeneous CPU-GPU Systems
Nowadays, heterogeneous CPU-GPU systems have become ubiquitous, but current parallel spatial interpolation (SI) algorithms exploit only one type of processing unit, and thus result in a waste of parallel resources. To address this problem, a hybrid parallel SI algorithm based on a thin plate spline is proposed to integrate both the CPU and GPU to further accelerate the processing of massive LiD...
3D Helmholtz Krylov Solver Preconditioned by a Shifted Laplace Multigrid Method on Multi-GPUs
We are focusing on an iterative solver for the three-dimensional Helmholtz equation on multi-GPU using CUDA (Compute Unified Device Architecture). The Helmholtz equation discretized by a second order finite difference scheme is solved with Bi-CGSTAB preconditioned by a shifted Laplace multigrid method. Two multi-GPU approaches are considered: data parallelism and split of the algorithm. Their i...
Journal title:
- CoRR
Volume abs/1708.06290, issue not given
Pages -
Publication date 2017